Statistical Programming with R

Very useful

You can access the course materials quickly from

https://laurencefrank.github.io/R

Some guidelines

  1. If you have a question, you may always interrupt me
  2. We will introduce frequent question breaks

Overview of this course

Program

Goal of this course

1. Learn to programme in R and build the foundation for a successful scripting career.

2. Learn statistical programming

What is statistical programming?

Broadly speaking, computer programming differs from statistical programming in the following way:

  • Computer programming is more focused on software development.

  • Statistical programming is more focused on data analysis and the communication of the results.

Source figure: R for Data Science

What is R?

Software

The origin of R

  • R is a language and environment for statistical computing and for graphics

  • GNU project (100% free software)

  • Managed by the R Foundation for Statistical Computing, Vienna, Austria.

  • Community-driven

  • Based on S, a language for statistical computing developed at Bell Labs in the mid-1970s

What is RStudio?

Integrated Development Environment

RStudio

  • Brings the information and tools you need together in one place
  • Allows you to work in projects
  • Supports writing code with syntax highlighting
  • Gives extra functionality (Shiny, knitr, markdown, LaTeX)
  • Integrates with version control systems, such as Git

How does R work?

Objects and elements

  • R works with objects that consist of elements. The simplest elements are numbers and characters.

    • These elements are assigned to objects.
    • A set of objects can be used to perform calculations.
    • Calculations can be written as functions.
    • Functions perform calculations and return new objects, containing calculated (or estimated) elements.
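
The chain from elements to objects to functions can be sketched in a few lines of R. This is only an illustration: the object names, the made-up values, and the helper function compute_bmi() are assumptions and not part of the course materials.

```r
# Elements (here: numbers) are assigned to objects
heights <- c(1.62, 1.75, 1.80)   # heights in metres (made-up values)
weights <- c(58, 72, 81)         # weights in kilograms (made-up values)

# A set of objects is used in a calculation; the result is a new object
bmi <- weights / heights^2

# The same calculation written as a function (hypothetical helper)
compute_bmi <- function(weight, height) {
  weight / height^2
}

# Calling the function returns a new object with the calculated elements
bmi_from_function <- compute_bmi(weights, heights)

# Both routes give exactly the same elements
identical(bmi, bmi_from_function)
```

Writing the calculation as a function pays off as soon as you need it more than once: you call compute_bmi() again instead of retyping the formula.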

The help

  • Everything that is published on the Comprehensive R Archive Network (CRAN) and is aimed at R users must be accompanied by a help file.

  • If you know the name of the function that performs an operation, e.g. anova(), then you just type ?anova or help(anova) in the console.

  • If you do not know the name of the function, type ?? followed by your search criterion. For example, ??anova returns a list of the help pages in your installed packages that match ‘anova’.

  • Alternatively, the internet will tell you almost everything you’d like to know. Sites such as http://www.stackoverflow.com and http://www.stackexchange.com, as well as Google, can be of tremendous help.

    • If you google R-related issues, use ‘R:’ as a prefix in your search term.
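
The two help shortcuts can also be written as ordinary function calls, which is handy inside scripts. A minimal sketch, assuming a standard R installation; the object names page and hits are chosen here only for illustration.

```r
# At the console you would simply type:   ?anova    or    ??anova
# The same requests written as function calls:
page <- help("anova")          # equivalent to ?anova
hits <- help.search("anova")   # equivalent to ??anova

# Typing `page` or `hits` at the console displays the help page
# or the list of matching help pages.
class(page)
class(hits)
```

Storing the result in an object only postpones the display; in everyday use you would type ?anova directly.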

Assigning elements to objects

Assigning things in R is very straightforward: you just use <-

For example, to assign the value 100 (an element) to object a, you would type:

a <- 100

Calling things in R is also very straightforward:

  • you just type the name you have given to the object

For example, we assigned the value 100 to object a. To call object a, we would type

a
## [1] 100
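
Assigned objects can be used directly in new assignments. A small sketch that builds on object a; the second object b is introduced here only for illustration.

```r
a <- 100        # assign the value 100 to object a
b <- a / 4      # use a in a calculation and store the result in b
b               # call b to see its value
a <- a + 1      # assigning to an existing name overwrites the old value
a
```

Note that b keeps its value after a is overwritten: an assignment stores the result of the calculation, not a link to the objects used in it.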

Writing code

This is why we use RStudio.

Organise your work in RStudio

Use RStudio Projects

Every time you start a new data analysis project, create a new RStudio Project.

Because you want your project to work:

  • not only now, but also in a few years;
  • when the folder and file paths have changed;
  • when collaborators want to run your code on their computer.

RStudio Projects create a convention that lets the project be moved around on your computer, or onto other computers, and still “just work”:

  • all code and outputs are stored in one set location;
  • file paths are relative to the project folder, because opening the project sets the working directory there;
  • a fresh R session is started every time you open the project.
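
What a relative file path looks like in practice can be sketched as follows. The folder layout and the file name survey.csv are hypothetical; the code only reads the file if it actually exists, so it runs anywhere.

```r
# Hypothetical project layout (names are assumptions for illustration):
#   my_project/
#     my_project.Rproj
#     data/survey.csv
#     scripts/analysis.R

# A relative path is resolved from the working directory, which is the
# project folder when the project is open in RStudio. It therefore keeps
# working when the whole folder is moved or shared.
data_file <- file.path("data", "survey.csv")
data_file

# Read the file only if it is present
if (file.exists(data_file)) {
  survey <- read.csv(data_file)
  head(survey)
} else {
  message("No file found at ", data_file, " (relative to ", getwd(), ")")
}
```

An absolute path that starts at the root of one particular computer would break as soon as the folder moves, which is exactly what the project convention avoids.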

Example data analysis project with RStudio project

Every time you want to work on this project: open the project by clicking the .Rproj file.

Practical A

Time for your first practical in R!